PLUS:

Positive Unlabeled data simulation

To comprehensively assess the proposed method, we designed a series of simulation studies. Four scenarios are simulated: a balanced population with two well-separated classes (the clear balanced scenario); a balanced population with two poorly separated classes (the noisy balanced scenario); an unbalanced population with two well-separated classes (the clear unbalanced scenario); and an unbalanced population with two poorly separated classes (the noisy unbalanced scenario). The degree of class imbalance and class separation is controlled by using a different propensity score function for each scenario. The scenario is selected through the scenario argument, for example:

simul_data=PU_data_simulation(p=100,N=200,confident_rate=0.5,scenario='noisy_balance',valid='01')
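To make the four scenarios more concrete, here is a minimal base-R sketch of how positive-unlabeled data with adjustable balance and separation could be generated. It is not the implementation of PU_data_simulation: the function name simulate_pu_sketch, its arguments (pos_rate, separation, label_rate), the scenario labels, and the use of a simple mean shift in place of the package's propensity score functions are all assumptions made for illustration.

```r
# Hypothetical sketch; NOT the code behind PU_data_simulation().
# A mean shift on a few variables stands in for the package's
# propensity score functions, which are not shown in this document.
simulate_pu_sketch <- function(N = 200, p = 100, pos_rate = 0.5,
                               separation = 2, label_rate = 0.5, seed = 1) {
  set.seed(seed)
  # Hidden true class: 1 = positive, 0 = negative
  y_true <- rbinom(N, size = 1, prob = pos_rate)
  # Features: standard normal noise for every sample and variable
  X <- matrix(rnorm(N * p), nrow = N, ncol = p)
  # Shift the first few variables for true positives to create class separation
  n_signal <- min(10, p)
  idx <- which(y_true == 1)
  X[idx, seq_len(n_signal)] <- X[idx, seq_len(n_signal)] + separation
  # Only a fraction of true positives is observed as labeled (1);
  # every other sample is unlabeled (0)
  labeled <- y_true == 1 & runif(N) < label_rate
  list(train_data = X, Label.obs = as.integer(labeled), y_true = y_true)
}

# Illustrative settings for the four scenarios (names and values are assumptions)
scenarios <- list(
  clear_balanced   = list(pos_rate = 0.5, separation = 2),
  noisy_balanced   = list(pos_rate = 0.5, separation = 0.5),
  clear_unbalanced = list(pos_rate = 0.2, separation = 2),
  noisy_unbalanced = list(pos_rate = 0.2, separation = 0.5)
)

sim <- do.call(simulate_pu_sketch, scenarios$noisy_balanced)
table(observed = sim$Label.obs, truth = sim$y_true)
```

In this sketch every labeled sample is a true positive, while the unlabeled group mixes hidden positives and negatives, which is the defining property of positive-unlabeled data.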

train_data: Gene expression matrix with N samples and M variables
Label.obs: Observed positive/unlabeled label for each sample; 1 means a true positive label, 0 means unlabeled
Sample_use_time: Used in the stopping criterion; the number of times each sample is to be used in the training process
l.rate: Controls how much information from the previous iteration is carried into the next one
qq: Quantile of the predicted probability among positive samples, used to determine the cutoff between positive and negative
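The role of qq can be illustrated with base R's quantile function. The sketch below is one plausible reading of the description above, not the package's internal code: the probabilities are made-up numbers, and the variable names prob_pos and cutoff are hypothetical.

```r
set.seed(42)
# Made-up predicted probabilities for 20 labeled-positive samples
prob_pos <- runif(20, min = 0.4, max = 1)

qq <- 0.1
# Cutoff taken as the qq-quantile of the probabilities among labeled positives
cutoff <- unname(quantile(prob_pos, probs = qq))

# With qq = 0.1, roughly 90% of the labeled positives score at or above the cutoff
mean(prob_pos >= cutoff)
```

Under this reading, a smaller qq gives a more permissive cutoff (more samples called positive), and a larger qq gives a stricter one.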

The result list contains three elements: pred.y gives the probability that each sample is predicted as positive; cutoff is the reference cutoff for converting continuous probabilities into binary 0/1 labels; pred.coef1 contains the variable coefficients used in the prediction model.
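As a sketch of how these three elements might be used together, the example below builds a mock list with the same element names and converts the probabilities into binary labels. The numeric values, the gene names, the use of >= at the cutoff, and the assumption that pred.coef1 behaves like a named numeric vector are all illustrative; the object actually returned by PLUS may be structured differently.

```r
# Mock object with the three documented element names (values are made up)
res <- list(
  pred.y     = c(0.91, 0.12, 0.55, 0.48, 0.77),
  cutoff     = 0.5,
  pred.coef1 = c(geneA = 0.8, geneB = 0, geneC = -0.3)
)

# Convert continuous probabilities to binary 0/1 labels using the reference cutoff
pred_label <- as.integer(res$pred.y >= res$cutoff)

# Variables with a non-zero coefficient in the prediction model
selected <- names(res$pred.coef1)[res$pred.coef1 != 0]

pred_label
selected
```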

### R packages required by the PLUS package
library(PLUS)
library(glmnet)

X=PLUS::example_data$train_data
Label=PLUS::example_data$Label.obs
Prediction=PLUS(train_data=X,Label.obs=Label,Sample_use_time=30,l.rate=1,qq=0.1)

Xiaoyu Lu (lu14@iu.edu)

Ph.D. candidate, Indiana University School of Medicine

Junyi Zhou (junyzhou@iu.edu)

Ph.D. candidate, Department of Biostatistics, Indiana University

Zhou, J., Lu, X., Chang, W., Wan, C., Zhang, C. and Cao, S., 2020. PLUS: predicting pan-cancer metastasis potential based on positive and unlabeled learning

zcslab/PLUS documentation built on June 20, 2020, 1:01 a.m.
